Proceedings of the ECML / PKDD – 2003
Authors
Abstract
This paper presents a novel method for extracting information from collections of Web pages across different sites. Our method uses a standard wrapper induction algorithm and exploits named entity information. We introduce the idea of post-processing the extraction results to resolve ambiguous facts and improve the overall extraction performance. Post-processing exploits two additional sources of information: fact transition probabilities, based on a trained bigram model, and confidence probabilities, estimated for each fact by the wrapper induction system. A multiplicative model based on the product of those two probabilities is also considered for post-processing. Experiments were conducted on pages describing laptop products, collected from many different sites and in four different languages. The results highlight the effectiveness of our approach.
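The multiplicative post-processing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `resolve_ambiguous`, the `START` marker, the `floor` value for unseen transitions, the Viterbi-style search over the whole sequence, and the example fact names and probabilities are all assumptions made for illustration. Each text fragment carries one or more candidate facts with a wrapper confidence, and the sketch picks the fact sequence that maximizes the product of bigram transition probability and confidence.

```python
from typing import Dict, List, Tuple

START = "<s>"  # hypothetical start-of-page marker, not from the paper


def resolve_ambiguous(
    candidates: List[Dict[str, float]],
    transition: Dict[Tuple[str, str], float],
    floor: float = 1e-6,
) -> List[str]:
    """Pick one fact per fragment, maximizing the product of bigram
    transition probability and wrapper confidence (Viterbi-style search).

    candidates[i] maps each candidate fact of fragment i to its confidence.
    transition[(a, b)] is the assumed probability that fact b follows fact a.
    floor is an assumed fallback probability for unseen transitions.
    """
    if not candidates:
        return []
    # best[fact] = (score of the best path ending in fact, that path)
    best = {
        fact: (transition.get((START, fact), floor) * conf, [fact])
        for fact, conf in candidates[0].items()
    }
    for options in candidates[1:]:
        step = {}
        for fact, conf in options.items():
            # Choose the predecessor that gives the highest path score.
            prev_fact, (prev_score, prev_path) = max(
                best.items(),
                key=lambda kv: kv[1][0] * transition.get((kv[0], fact), floor),
            )
            score = prev_score * transition.get((prev_fact, fact), floor) * conf
            step[fact] = (score, prev_path + [fact])
        best = step
    return max(best.values(), key=lambda score_path: score_path[0])[1]


# Illustrative (made-up) laptop example: the second fragment is ambiguous.
example_candidates = [
    {"manufacturer": 0.9},
    {"ram": 0.45, "hdd": 0.55},
    {"hdd": 0.8},
]
example_transition = {
    (START, "manufacturer"): 0.8,
    ("manufacturer", "ram"): 0.6,
    ("manufacturer", "hdd"): 0.1,
    ("ram", "hdd"): 0.7,
    ("hdd", "hdd"): 0.05,
}
resolved = resolve_ambiguous(example_candidates, example_transition)
```

In this made-up example, confidence alone would label the second fragment as `hdd`, whereas the product with the transition probabilities favours `ram`, because `manufacturer` followed by `ram` followed by `hdd` is the more likely fact sequence under the assumed bigram model.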
Similar resources
’ introduction : special issue of the ECML /
This special issue is a collection of papers submitted to the ECML/PKDD 2013 and 2014 journal tracks and accepted for publication in “Machine Learning”. The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML/PKDD, launched its journal track in 2013. In order to cover the full scope of the conference, which is a merger of the formerly independe...
Reliability Maps: A Tool to Enhance Probability Estimates and Improve Classification Accuracy
Probability Estimates and Improve Classification Accuracy (Best paper award). In T. Calders, F. Esposito, E. Hullermeier, & R. Meo (Eds.), Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2014, Nancy, France, September 15-19, 2014. Proceedings, Part II (pp. 18-33). (Lecture Notes in Artificial Intelligence; Vol. 8725). Springer Berlin Heidelberg, 2010. DOI: ...
Machine Learning: ECML 2005, 16th European Conference on Machine Learning, Porto, Portugal, October 3-7, 2005, Proceedings
It sounds good when knowing the machine learning ecml 2005 16th european conference on machine learning porto portugal october 3 7 2005 proceedings lecture notes in computer in this website. This is one of the books that many people looking for. In the past, many people ask about this book as their favourite book to read and collect. And now, we present hat you need quickly. It seems to be so h...
MADSPAM Consortium at the ECML/PKDD Discovery
We present here the contribution of the MADSPAM consortium to the ECML/PKDD Discovery Challenge 2010. The submitted method is based on both a RankBoost algorithm and on propagation techniques.
Multi-Plant Photovoltaic Energy Forecasting Challenge: Second Place Solution
This paper presents the approach we took to solve the Multi-Plant Photovoltaic Energy Forecasting Challenge for ECML/PKDD 2017. This approach earned us second place in that challenge. In the paper, we present how we moved from standard regression techniques to simple function optimization to tackle the challenge.
Publication date: 2003